Basic Computer Skills .htm
TEB  Computer Kids Academy
![](datasci123.png)
Data Science
Data Science - Statistics Correlation vs. Causality
Correlation Does Not Imply Causality
Correlation measures the numerical relationship between two variables.
A high correlation coefficient (close to 1)
does not mean that we can conclude with certainty
that there is an actual cause-and-effect relationship between the two variables.
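To make the term "correlation coefficient" concrete, here is a minimal sketch of how the Pearson coefficient can be computed by hand and checked against NumPy. The function name `pearson_r` and the sample numbers are made up for illustration; they are not part of this lesson's data.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y, scaled by their spreads."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations from the mean of x
    dy = y - y.mean()  # deviations from the mean of y
    return (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

# Hypothetical sample values, chosen only for illustration
hours_of_sun = [1, 2, 3, 4, 5]
visitors = [12, 15, 21, 24, 33]

r_manual = pearson_r(hours_of_sun, visitors)
r_numpy = np.corrcoef(hours_of_sun, visitors)[0, 1]
print(r_manual, r_numpy)  # the two values should agree
```

A value near 1 means the points lie close to a rising straight line, a value near -1 means a falling line, and a value near 0 means no linear pattern.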
Let us use this example:
During the summer, the sale of ice cream at a beach increases.
At the same time, drowning accidents also increase.
Does this mean that the increase in ice cream sales is a direct
cause of the increase in drowning accidents?
The Beach Example in Python
Example
import pandas as pd
import matplotlib.pyplot as plt

# Illustrative numbers: both series rise together
Drowning_Accident = [20, 40, 60, 80, 100, 120, 140, 160, 180, 200]
Ice_Cream_Sale = [20, 40, 60, 80, 100, 120, 140, 160, 180, 200]

# Build the DataFrame from the two lists defined above
Drowning = pd.DataFrame({"Drowning_Accident": Drowning_Accident,
                         "Ice_Cream_Sale": Ice_Cream_Sale})

# Scatter plot of ice cream sales against drowning accidents
Drowning.plot(x="Ice_Cream_Sale", y="Drowning_Accident", kind="scatter")
plt.show()

# Pairwise correlation matrix of the two columns
correlation_beach = Drowning.corr()
print(correlation_beach)
Output
![](savingatat.png)
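Because the two columns in this example contain identical numbers, the correlation between them is 1. If only that single value is needed, instead of the full matrix, a sketch like the following reads it directly; the variable name `r` is chosen here for illustration.

```python
import pandas as pd

Drowning = pd.DataFrame({
    "Drowning_Accident": [20, 40, 60, 80, 100, 120, 140, 160, 180, 200],
    "Ice_Cream_Sale": [20, 40, 60, 80, 100, 120, 140, 160, 180, 200],
})

# Pearson correlation between the two columns (the default method in pandas)
r = Drowning["Ice_Cream_Sale"].corr(Drowning["Drowning_Accident"])
print(round(r, 2))
```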
Correlation vs Causality - The Beach Example
In other words: can we use ice cream sales to predict drowning accidents?
The answer is: probably not.
It is likely that these two variables are correlated only by coincidence, or because both increase during the summer, and not because one causes the other.
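One way to see how a shared cause can produce such a correlation is to simulate it. The sketch below assumes a third variable, daily temperature, that drives both ice cream sales and drowning accidents. All names, numbers, and coefficients are invented for illustration and do not come from real data.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

# Hypothetical shared cause: daily temperature in degrees Celsius
temperature = rng.uniform(15, 35, size=500)

# Both variables depend on temperature plus independent noise;
# neither one depends on the other.
ice_cream_sale = 10 * temperature + rng.normal(0, 20, size=500)
drowning_accident = 0.5 * temperature + rng.normal(0, 1, size=500)

# Raw correlation between the two variables
r_raw = np.corrcoef(ice_cream_sale, drowning_accident)[0, 1]

def residuals(y, x):
    """Remove the straight-line effect of x from y (least-squares fit)."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

# Correlation that remains after removing the effect of temperature from both
r_adjusted = np.corrcoef(residuals(ice_cream_sale, temperature),
                         residuals(drowning_accident, temperature))[0, 1]

print("raw correlation:", round(r_raw, 2))
print("after adjusting for temperature:", round(r_adjusted, 2))
```

In a simulation set up this way, the raw correlation comes out high, while the correlation left after accounting for temperature is close to zero. That is the pattern to expect when a shared cause, and not a direct link, connects two variables.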
Note that there is an important difference between correlation and causality:
Correlation is a number that measures how closely the data are related.
Causality is the conclusion that x causes y.
Advice: Always reflect critically on
the concept of causality when making predictions!